fix(llmobs): filter openai Omit/NotGiven sentinels from span metadata by jessicagamio · Pull Request #18552 · DataDog/dd-trace-py

jessicagamio · 2026-06-09T23:38:40Z

Overview

The OpenAI integration captured the raw chat-completion request kwargs as LLM span metadata, including the openai SDK's Omit / NotGiven sentinel objects used as defaults for unset parameters. These were serialized to noisy repr strings such as "<openai.Omit object at 0x7f5e35900e90>" across most of the ~21 metadata keys per span, making the metadata field unqueryable and burying the real parameters.

Motivation

Frameworks like PydanticAI forward every chat-completion parameter explicitly, defaulting any the caller didn't set to openai.omit. ddtrace snapshots the kwargs at the wrapper boundary — upstream of the openai SDK's own sentinel-stripping (_merge_mappings) — so the sentinels reach span metadata even though they never reach the provider. The request to OpenAI/Azure was always correct; only the recorded metadata was polluted.
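
The ordering problem described above can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in: `Omit`, `traced`, `create`, and `recorded` are illustrative names, not the real openai or ddtrace code. The point is only that a wrapper that snapshots kwargs before the callee strips its sentinels will record the sentinel's repr, even though the outgoing request is clean.

```python
import functools


class Omit:
    """Hypothetical stand-in for an SDK sentinel; not the real openai class."""


omit = Omit()
recorded = {}  # plays the role of span metadata in this sketch


def traced(func):
    """Toy tracing wrapper that snapshots kwargs at the wrapper boundary."""
    @functools.wraps(func)
    def wrapper(**kwargs):
        # The snapshot happens here, before the callee strips anything.
        recorded.update({key: str(value) for key, value in kwargs.items()})
        return func(**kwargs)
    return wrapper


@traced
def create(**kwargs):
    """Stand-in for the SDK call: drops sentinels before 'sending'."""
    return {k: v for k, v in kwargs.items() if not isinstance(v, Omit)}


sent = create(model="some-model", top_p=0.9, temperature=omit)
```

After this runs, `sent` contains only `model` and `top_p`, while `recorded["temperature"]` holds a default object repr ending in `Omit object at 0x...>`, which is the kind of noise the PR removes.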

Change

Filter Omit / NotGiven sentinel values out before building metadata, in both:

- get_metadata_from_kwargs (chat / completion)
- openai_get_metadata_from_response (responses API)

Sentinel types are resolved lazily and independently via a small cached helper. Lazy resolution avoids a circular import while ddtrace is patching openai at import time; independent resolution keeps NotGiven filtering working on openai<2 (which has no Omit). On openai-less installs the helper returns an empty tuple and the filter is a no-op.
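
A minimal sketch of this approach follows; it is not the code from the PR. The names `resolve_sentinel_types` and `drop_sentinels` are hypothetical, and the sketch assumes the sentinel classes are reachable as the top-level attributes `openai.NotGiven` and `openai.Omit`. `FakeOmit` is a stand-in so the example runs without openai installed.

```python
import functools
import importlib
from typing import Any, Dict, Optional, Tuple


@functools.lru_cache(maxsize=1)
def resolve_sentinel_types() -> Tuple[type, ...]:
    """Look up the sentinel classes on first call and cache the result."""
    try:
        # Imported inside the function, so nothing is loaded at module import time.
        openai = importlib.import_module("openai")
    except ImportError:
        return ()  # openai is not installed: nothing to filter
    found = []
    for name in ("NotGiven", "Omit"):
        # Each name is looked up on its own, so a missing Omit does not
        # prevent NotGiven from being found.
        cls = getattr(openai, name, None)
        if isinstance(cls, type):
            found.append(cls)
    return tuple(found)


def drop_sentinels(
    kwargs: Dict[str, Any],
    sentinel_types: Optional[Tuple[type, ...]] = None,
) -> Dict[str, Any]:
    """Return a copy of kwargs without values that are sentinel instances."""
    types = resolve_sentinel_types() if sentinel_types is None else sentinel_types
    if not types:
        return dict(kwargs)  # no known sentinel types: the filter is a no-op
    return {k: v for k, v in kwargs.items() if not isinstance(v, types)}


class FakeOmit:
    """Hypothetical stand-in so the example runs without openai installed."""


params = {"top_p": 0.9, "temperature": FakeOmit(), "stop": FakeOmit()}
cleaned = drop_sentinels(params, sentinel_types=(FakeOmit,))
```

Here `cleaned` keeps only `top_p`. Passing `sentinel_types` explicitly is a convenience of this sketch for testing; the cached lookup is what keeps the import lazy and the per-call cost small.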

Testing

- Added regression test test_chat_completion_filters_openai_sentinel_metadata, which passes a real value (top_p) alongside Omit/NotGiven sentinels and asserts that only the real value lands in metadata. It fails without the fix.
- Verified across the full openai riot matrix (latest, <2.0.0, ~=1.76.2, ==1.66.0), including 1.x, which exercises the NotGiven-only path.
- Reproduced the customer's exact stack (PydanticAI → Azure OpenAI): metadata drops from 21 keys (~20 sentinels) to only real values.

Risk

Low. The change only removes sentinel placeholder values from metadata; real parameter values are unaffected. Behavior is unchanged for non-openai integrations.

Jira: MLOB-7613

datadog-datadog-prod-us1 · 2026-06-09T23:39:10Z

Tests

⚠️ Warnings

🚦 8 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-py | build linux serverless: [amd64, cp315-cp315, v113741238-d2b8243-manylinux2014_x86_64, 1]

DataDog/apm-reliability/dd-trace-py | build linux serverless: [amd64, cp315-cp315, v113741491-d2b8243-musllinux_1_2_x86_64, 1]

DataDog/apm-reliability/dd-trace-py | build linux serverless: [arm64, cp315-cp315, v113741357-d2b8243-manylinux2014_aarch64, 1]

View all 8 failed jobs.

ℹ️ Info

No other issues found

🧪 All tests passed
❄️ No new flaky tests detected

🔗 Commit SHA: 268c453

cit-pr-commenter-54b7da · 2026-06-09T23:41:19Z

Codeowners resolved as

ddtrace/llmobs/_integrations/utils.py                                   @DataDog/ml-observability
releasenotes/notes/llmobs-filter-openai-omit-metadata-15533386303440c7.yaml  @DataDog/apm-python
tests/contrib/openai/test_openai_llmobs.py                              @DataDog/ml-observability

emmettbutler

release note looks fine

Yun-Kim

I'm fine with what we're fixing, just wonder if there's an easier way to do this

The OpenAI integration captured the raw request kwargs as span metadata, including the openai SDK's `Omit`/`NotGiven` sentinel objects used as defaults for unset parameters. These were serialized to noisy repr strings such as "<openai.Omit object at 0x...>", making metadata unqueryable and burying the real parameters. Frameworks like PydanticAI forward every chat-completion parameter explicitly, defaulting unset ones to `openai.omit`, which is why this surfaced broadly. Filter both sentinel types out before building metadata, in both `get_metadata_from_kwargs` (chat/completion) and `openai_get_metadata_from_response` (responses API). Sentinel types are resolved lazily and independently to avoid a circular import at patch time and to keep `NotGiven` filtering working on openai<2 (which has no `Omit`).

github-actions · 2026-06-11T22:56:03Z

This change is marked for backport to 4.10 and it does not conflict with that branch.
The command used to test backporting was

git fetch origin 4.10 && git checkout origin/4.10 && git checkout -b backport--to-4.10 && git cherry-pick -x --mainline 1 faf70d11d6ddc328e190255f7d47a9601d7878b5

…#18552) Co-authored-by: jessica.gamio <jessica.gamio@datadoghq.com> (cherry picked from commit faf70d1) Co-authored-by: Jessica Gamio <52049720+jessicagamio@users.noreply.github.com>

… [backport 4.10] (#18592) Backport #18552 to 4.10 Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jessica Gamio <52049720+jessicagamio@users.noreply.github.com>

github-actions · 2026-06-12T16:50:47Z

This change is marked for backport to 4.11 and it does not conflict with that branch.
The command used to test backporting was

git fetch origin 4.11 && git checkout origin/4.11 && git checkout -b backport--to-4.11 && git cherry-pick -x --mainline 1 faf70d11d6ddc328e190255f7d47a9601d7878b5

…#18552) Co-authored-by: jessica.gamio <jessica.gamio@datadoghq.com> (cherry picked from commit faf70d1) Co-authored-by: Jessica Gamio <52049720+jessicagamio@users.noreply.github.com>

… [backport 4.11] (#18604) Backport #18552 to 4.11 Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jessica Gamio <52049720+jessicagamio@users.noreply.github.com>

jessicagamio requested review from a team as code owners June 9, 2026 23:38

jessicagamio requested review from avara1986 and r1viollet June 9, 2026 23:38

jessicagamio force-pushed the MLOS-693/filter-openai-omit-metadata branch 2 times, most recently from f04d054 to df1efab Compare June 10, 2026 03:00

emmettbutler approved these changes Jun 10, 2026

Yun-Kim reviewed Jun 10, 2026

Comment thread ddtrace/llmobs/_integrations/utils.py Outdated

jessicagamio force-pushed the MLOS-693/filter-openai-omit-metadata branch from df1efab to 0bf1ef6 Compare June 10, 2026 21:55

Merge branch 'main' into MLOS-693/filter-openai-omit-metadata

1130aa5

Yun-Kim approved these changes Jun 11, 2026

Comment thread releasenotes/notes/llmobs-filter-openai-omit-metadata-15533386303440c7.yaml Outdated

Update releasenotes/notes/llmobs-filter-openai-omit-metadata-15533386…

268c453

…303440c7.yaml Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>

gh-worker-dd-mergequeue-cf854d Bot merged commit faf70d1 into main Jun 11, 2026
584 checks passed

gh-worker-dd-mergequeue-cf854d Bot deleted the MLOS-693/filter-openai-omit-metadata branch June 11, 2026 22:01

jessicagamio added the backport 4.10 label Jun 11, 2026

dd-octo-sts Bot mentioned this pull request Jun 11, 2026

fix(llmobs): filter openai Omit/NotGiven sentinels from span metadata [backport 4.10] #18592

Merged

jessicagamio added the backport 4.11 label Jun 12, 2026

dd-octo-sts Bot mentioned this pull request Jun 12, 2026

fix(llmobs): filter openai Omit/NotGiven sentinels from span metadata [backport 4.11] #18604

Merged

gh-worker-dd-mergequeue-cf854d[bot] merged 3 commits into main from MLOS-693/filter-openai-omit-metadata
